Abstract

The stability and internal consistency of Spence and Helmreich's
sex-role scales, the Attitudes toward Women Scale (AWS) and Personal
Attributes Questionnaire (PAQ), are reported. An entire first-year
class of 1,007 male and 78 female cadets at the U. S. Military
Academy was given a battery of psychological tests before and after
cadet basic training, a two and one-half month period. The AWS and
PAQ  proved  to  be  highly  reliable,  comparable  to  other  frequently  used 
psychological  tests.  This  psychometric  information  encourages  researchers 
to  make  further  use  of  these  sex-role  scales. 


Reliability of the Attitudes Toward Women Scale (AWS)
and the Personal Attributes Questionnaire (PAQ)

In  recent  years,  we  have  seen  a  dramatic  increase  in  the  volume  of 
empirical  research  directed  toward  the  issue  of  sex-roles.  Perhaps  the  most 
telling  evidence  for  this  trend  is  the  presence  of  this  journal,  devoted 
exclusively  to  the  topic.  One  important  catalyst  to  this  high  level  of 
activity  has  been  the  development  of  research  instruments  specifically 
designed to measure concepts important to understanding sex-role phenomena.
Two of the most extensively used instruments of this type are the Attitudes
toward Women Scale (AWS) and the Personal Attributes Questionnaire (PAQ)
(Spence & Helmreich, 1973).

The  AWS  measures  attitudes  concerning  the  rights,  roles,  obligations,  and 
privileges that women should have in modern society. It provides scores along
a  continuum  ranging  from  endorsement  of  traditional  sex-roles  to  an  egalitarian 
view  of  the  roles  of  women  and  men. 

The  PAQ  is  a  self-concept  scale.  Items  in  this  scale  can  be  classified 
into  three  general  categories:  (a)  characteristics  that  are  generally 
regarded  as  being  highly  masculine  and  that  are  desired  by  both  men  and  women 
(PAQ M), (b) qualities that are more stereotypically ascribed as feminine and
that are positively valued by both women and men (PAQ F), and (c) personality
attributes that are desirable for one gender and undesirable for the other
(PAQ M-F). The M-F score is like a traditional, unidimensional masculinity-femininity
scale with high scores indicating masculinity and low scores reflecting
femininity. The subscales were designed, and shown to be, independent.
Spence and Helmreich (1978) use the combination of each respondent's M and
F scores to classify that person as androgynous, masculine, feminine, or
undifferentiated in his or her self-concept.

In their several publications describing these two instruments, Spence
and Helmreich have presented surprisingly limited reliability information
(Spence & Helmreich, 1972, 1973; Spence, Helmreich, & Stapp, 1973, 1974). For
the AWS, they reported only one estimate of internal consistency; coefficient
alpha was .91 for a 15-item version of the scale for a college student sample
of unspecified size (Spence & Helmreich, 1978, p. 39). Also, only one
estimate of internal consistency was available for the PAQ; Spence and
Helmreich (1978, p. 35) reported coefficient alpha values of .85, .82, and
.78 for the PAQ M, PAQ F, and PAQ M-F, respectively, in a college student
sample of unspecified size. No test-retest reliability information has been
provided  for  either  the  AWS  or  the  PAQ.  The  objective  of  the  present  article 
is  to  further  examine  the  reliability  of  these  two  measures. 

The  United  States  Military  Academy's  Project  Athena  (Vitters,  Note  1) 
provided  an  especially  good  opportunity  to  examine  the  reliability  of  these 
two measures. The purpose of Project Athena is to examine the impact of admitting
women to the military academy on both the institution and individual
cadets. Part of this research project has involved repeated testing of the
cadet population with a number of standard psychological measures, including
the two Spence and Helmreich sex-role instruments. This longitudinal design
provided  the  data  necessary  to  assess  the  test-retest  reliability  of  these 
measures  as  well  as  their  internal  consistency  at  different  points  in  time. 

The  inclusion  of  several  other  well-known  psychological  measures  in  the 
Project  Athena  design  provided  an  important  comparative  basis  for  evaluating 
the reliability of the two sex-role measures. This variable set included
both attitudinal and personality scales. The personality measures were the
Rotter Locus of Control (I-E), the Tennessee Self-Concept Scale (TSCS), and
Sarason's Test Anxiety Scale (TAS). There was also an attitudinal measure of
organizational commitment.


Method


Materials


As several different versions of the Spence and Helmreich measures are
now available, it is important to identify precisely the scales employed.
The 25-item AWS, identified by Spence and Helmreich (1972) as the short version
of the scale, was used throughout this study. High scores indicate an egalitarian
view, while low scores represent traditional attitudes concerning the
role of women.

The PAQ used in this study was the 24-item scale described by Spence and
Helmreich (1978). On the basis of their M and F scores, individuals were
classified as androgynous (high M, high F), masculine (high M, low F), feminine
(low M, high F), or undifferentiated (low M, low F). The norms established
by Spence and Helmreich (1978) using college students were used to define
high and low M and F scores (medians were 21 and 23, respectively).
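The four-way classification just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scoring routine used in Project Athena: the function name classify_paq is hypothetical, and treating a score as "high" only when it is strictly above the median is an assumption, since the handling of scores that fall exactly at the median is not specified here.

```python
def classify_paq(m_score, f_score, m_median=21, f_median=23):
    """Assign a sex-role category from PAQ M and PAQ F scores.

    The default medians (21 for M, 23 for F) are the college-student
    norms cited in the text. ASSUMPTION: a score counts as "high" only
    if it is strictly above the median; ties are treated as "low".
    """
    high_m = m_score > m_median
    high_f = f_score > f_median
    if high_m and high_f:
        return "androgynous"
    if high_m:
        return "masculine"
    if high_f:
        return "feminine"
    return "undifferentiated"


# Hypothetical respondent with M = 25 and F = 20 (illustrative values only).
category = classify_paq(25, 20)
```

Passing different values for m_median and f_median would allow the same sketch to be used with norms from another sample.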

In addition to the two sex-role measures, the design included the following
instruments:

Tennessee Self-Concept Scale (TSCS). From this multidimensional measure
of  self-concept,  only  the  composite  "total  positive"  score  is  reported  here. 
Persons  with  high  scores  on  this  subscale  of  the  TSCS  feel  that  they  are 
worthwhile  persons,  and  they  feel  and  act  confidently  (Fitts,  1965).  The 
scores  range  from  100  to  500. 

Test Anxiety Scale (TAS). Sarason's (1962) 16-item scale provides a
measure of anxiety in test-taking and related situations involving performance
evaluation.  High  scores  on  this  scale  indicate  high  levels  of  reported  anxiety. 

Rotter I-E. The 29-item Locus of Control (I-E) Scale measures generalized
expectancies for internal versus external control over life events (Rotter,
1965). The scale is scored so that high scores represent beliefs in
external locus of control of reinforcement.

Organizational Commitment (OC). The 15-item measure of organizational
commitment, developed by Porter, Steers, Mowday, and Boulian (1974), was
modified for the specific concerns of West Point. High scores on the attitude
scale represent a high level of support, endorsement, and commitment to the
goals and values of the academy.

Procedure 

The  basic  design  of  Project  Athena  called  for  administration  of  the 
tests in early July, and again in mid-September, of 1976 to 1,007 male and 78
female freshman cadets. Cadet Basic Training, a stressful introduction to
military life, intervened between these administrations. These numbers
reflect some attrition (initially, there were 1,221 men and 119 women).

In addition, the PAQ was administered to this entire class of cadets at
five different times throughout a two-year period.

Results 

The  AWS  and  PAQ  tests  are  reliable,  as  shown  by  the  two  and  one-half 
month  test-retest  reliabilities  and  coefficient  alpha  measures  of  internal 
consistency  listed  in  Table  1. 
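For readers who wish to compute the same two indices on their own data, the following is a minimal Python sketch of coefficient alpha and of a test-retest (Pearson) correlation. It is offered as an illustration only and is not the analysis program used in this study; the function names and the small data set are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of the total (summed) scale score.
    total_var = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between paired scores, e.g. time 1 and time 2."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical item responses for four respondents on a three-item scale.
items_time1 = [[3, 4, 3], [1, 2, 1], [4, 4, 3], [2, 1, 2]]
items_time2 = [[3, 3, 4], [2, 2, 1], [4, 3, 4], [1, 2, 2]]

alpha_time1 = cronbach_alpha(items_time1)
retest_r = pearson_r([sum(r) for r in items_time1],
                     [sum(r) for r in items_time2])
```

Alpha is computed separately at each testing, whereas the test-retest coefficient correlates the total scores of the same respondents across the two testings.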


Insert  Table  1  about  here 


A  comparison  of  these  reliability  coefficients  with  those  of  the  other 
personality  and  attitudinal  tests  administered  at  the  same  times  indicates 
that  the  sex-role  measures  are  as  reliable  as  these  other  generally  used 
psychological  tests. 



The test-retest reliability coefficients for the repeated PAQ
testings, given in Table 2, provide even stronger evidence of the reliability
of scores on these scales. Over a period of two years, these scores were
quite stable.


Insert  Table  2  about  here 


As  mentioned  earlier,  PAQ  scores  have  been  used  to  categorize  individuals 
as androgynous, masculine, feminine, or undifferentiated. Cadets were classified
into one of these four categories, first using their PAQ scores at time 1
and again at time 2. Table 3 shows the consistency of these classifications
for  male  and  female  respondents.  The  principal  diagonal  shows  the  percentage 


Insert Table 3 about here


of  individuals  placed  in  each  category  at  time  1  who  remained  in  that  category 
after a second testing over two months later. For example, 70% of those
men classified as androgynous at time 1 were classified similarly at time 2.
The percentages are similar for men and women. Overall, 57% of all the men
and 54% of all the women remained in the same sex-role category across
testings. The rows of the table show shifts in categorization. Both women
and  men  tended  to  shift  to  the  androgynous  and  masculine  categories  after 
basic  training.  Such  a  shift  is  not  surprising  given  the  high  physical 
performance  demands  of  basic  training. 
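The consistency percentages discussed above can be obtained from paired category labels, as in the following Python sketch. It is an illustration under assumptions rather than the procedure used to build Table 3: the function name and the example labels are hypothetical, and the percentages are computed within each time 1 category (row percentages), which is how the principal diagonal is described in the text.

```python
from collections import Counter


def classification_stability(time1, time2):
    """Return overall percent agreement and row percentages.

    time1 and time2 are equal-length lists of category labels for the
    same respondents. Row percentages give, for each time 1 category,
    the share of its members found in each time 2 category.
    """
    pairs = Counter(zip(time1, time2))
    row_totals = Counter(time1)
    same = sum(count for (a, b), count in pairs.items() if a == b)
    overall = 100.0 * same / len(time1)
    row_pct = {(a, b): 100.0 * count / row_totals[a]
               for (a, b), count in pairs.items()}
    return overall, row_pct


# Hypothetical labels for four respondents (not data from this study).
t1 = ["androgynous", "androgynous", "masculine", "feminine"]
t2 = ["androgynous", "masculine", "masculine", "androgynous"]
overall, row_pct = classification_stability(t1, t2)
```

Entries of row_pct whose two labels match correspond to the principal diagonal; the remaining entries describe shifts between categories.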

It is also interesting to note changes in these measures across testings
separated by Cadet Basic Training (CBT). Men became more traditional in their
attitudes toward women after going through basic training with their female
classmates (see Table 1). Men rated themselves as being both more masculine
(PAQ M) and more feminine (PAQ F) after training, while women only indicated
increases in their ratings of femininity. After training, both women and
men exhibited increases in their self-concept (TSCS), organizational commitment
(OC), and the internality of their locus of control (Rotter I-E). Men
reported that their test anxiety (TAS) increased after training. In contrast,
women consistently reported high levels of test anxiety, both before
and after training. Both the sex-role and other tests were significantly
influenced by the events that intervened between testings. Priest, Prince,
and Vitters (Note 2) have described in detail these and other changes related
to the cadet training experience.

Discussion 

The decision to use any psychological measure requires adequate information
about both its reliability and validity. The reliability assessment
reported in the present article suggests that the reliability of both the
AWS and PAQ is good. For both measures, the internal consistency coefficients
were very high. Furthermore, the test-retest reliability coefficients
suggest that both these scales measure relatively stable characteristics of
the respondents, especially when one considers the quite dramatic nature of
the experience during the test-retest interval. These reliability data, in
conjunction with the construct validity data reported by Spence and Helmreich
(1978), should encourage researchers to include these scales in future
research requiring either a measure of attitudes toward women's role in
society (the AWS) or self-perceptions of masculinity and femininity (the
PAQ).


Table 1

Reliability of the AWS and PAQ

[Table body not recovered. Column headings: Measure, Subjects, Test-Retest Reliability, Coefficient Alpha, Means.]


Table 2

Test-Retest Reliabilities of PAQM and PAQF

PAQM       Time 1   Time 2   Time 3   Time 4   Time 5

Time 1      1.00      .62      .45      .62      .67
Time 2       .58     1.00      .52      .70      .70
Time 3       .49      .48     1.00      .63      .62
Time 4       .55      .55      .64     1.00      .68
Time 5       .41      .46      .58      .61     1.00

PAQF       Time 1   Time 2   Time 3   Time 4   Time 5

Time 1      1.00      .67      .48      .57      .44
Time 2       .54     1.00      .47      .63      .62
Time 3       .46      .51     1.00      .65      .41
Time 4       .46      .46      .56     1.00      .41
Time 5       .43      .45      .51      .53     1.00

Note. The correlations for female cadets are listed above the
diagonal and for males, below it. Time 1 = June 1976 (pre-CBT), Time 2 =
August 1976 (post-CBT), Time 3 = April 1977, Time 4 = August 1977, Time 5 =
August 1978.
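Because a single matrix in Table 2 carries the coefficients for both sexes, it can be convenient to separate the two triangles before inspecting them. The Python sketch below does this for the PAQM values reported in the table; the function name split_by_sex and the keying of each coefficient by its (earlier, later) pair of testings are illustrative assumptions rather than part of the original analysis.

```python
# PAQM test-retest correlations as reported in Table 2.
# Females are above the diagonal, males below it.
PAQ_M = [
    [1.00, .62, .45, .62, .67],
    [.58, 1.00, .52, .70, .70],
    [.49, .48, 1.00, .63, .62],
    [.55, .55, .64, 1.00, .68],
    [.41, .46, .58, .61, 1.00],
]


def split_by_sex(matrix):
    """Split a combined matrix into female and male coefficients.

    Each result maps (earlier_time, later_time), numbered from 1,
    to the corresponding test-retest correlation.
    """
    n = len(matrix)
    female = {(i + 1, j + 1): matrix[i][j]
              for i in range(n) for j in range(i + 1, n)}
    male = {(j + 1, i + 1): matrix[i][j]
            for i in range(n) for j in range(i)}
    return female, male


female, male = split_by_sex(PAQ_M)
# Coefficients spanning the full two-year interval (Time 1 to Time 5).
two_year_female = female[(1, 5)]
two_year_male = male[(1, 5)]
```

The same function can be applied to the PAQF matrix without modification.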


Table 3

Percentages of Cadets Classified by Sex Type Before and After Cadet Basic Training

[Table body not recovered.]

Note. Numbers in parentheses are the number of respondents in each cell.


